Scheduling Large Task Graphs in Parallel Using a Fault-Tolerant Heterogeneous-Cluster-Based Search
Abstract
A natural approach for scheduling tasks on a workstation cluster is to employ the multiple machines in the cluster to schedule the task graphs, so that the cluster manifests itself as a “self-scheduled” platform. A few parallel approaches have been devised for scheduling task graphs using a parallel machine such as an Intel Paragon, but they are not suitable for a cluster-of-workstations environment because the machines in a cluster are heterogeneous and may fail. In this paper, we propose a new approach, called Cluster-Based Search (CBS), for scheduling large task graphs in parallel on a heterogeneous cluster of workstations. The CBS algorithm uses a parallel random neighborhood search, which refines multiple different initial schedules simultaneously on different workstations. Heterogeneity of machines is handled through a biased partitioning of the search space. The parallel random neighborhood search is fault-tolerant in that the workload of a failed workstation is automatically redistributed to other workstations so that the search can continue. Our performance evaluation indicates that the CBS approach can generate high-quality solutions and is scalable.
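The abstract does not say how the biased partitioning or the workload redistribution is carried out. As an illustration only, the sketch below assumes that the search space is cut into numbered regions, that each workstation receives a share of regions proportional to a known relative speed, and that a failed workstation's regions are handed to the survivors using the same proportional rule. The function names `partition_by_speed` and `redistribute`, the speed values, and the region numbering are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def partition_by_speed(num_regions: int, speeds: Dict[str, float]) -> Dict[str, List[int]]:
    """Split region indices 0..num_regions-1 among workers in proportion to speed.

    Assumption: 'speed' is a relative rating supplied by the caller.
    """
    total = sum(speeds.values())
    workers = sorted(speeds)  # fixed order keeps the result deterministic
    # Ideal fractional share per worker, rounded with the largest-remainder rule.
    ideal = {w: num_regions * speeds[w] / total for w in workers}
    counts = {w: int(ideal[w]) for w in workers}
    leftover = num_regions - sum(counts.values())
    by_remainder = sorted(workers, key=lambda w: ideal[w] - counts[w], reverse=True)
    for w in by_remainder[:leftover]:
        counts[w] += 1
    # Hand out contiguous blocks of region indices.
    assignment: Dict[str, List[int]] = {}
    start = 0
    for w in workers:
        assignment[w] = list(range(start, start + counts[w]))
        start += counts[w]
    return assignment


def redistribute(
    assignment: Dict[str, List[int]], failed: str, speeds: Dict[str, float]
) -> Dict[str, List[int]]:
    """Give the failed worker's regions to the survivors, again biased by speed."""
    orphaned = assignment[failed]
    survivors = {w: s for w, s in speeds.items() if w != failed and w in assignment}
    if not survivors:
        raise RuntimeError("no surviving workers to take over the search")
    # Reuse the same proportional rule on the orphaned regions only.
    extra = partition_by_speed(len(orphaned), survivors)
    new_assignment = {w: list(r) for w, r in assignment.items() if w != failed}
    for w, positions in extra.items():
        new_assignment[w].extend(orphaned[i] for i in positions)
    return new_assignment
```

For example, with hypothetical speeds `{"a": 1.0, "b": 2.0, "c": 1.0}` and 8 regions, `partition_by_speed` gives worker `b` half of the regions. Calling `redistribute(plan, "b", speeds)` then splits those regions evenly between `a` and `c`, so every region is still being searched.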
Similar resources
An Efficient Genetic Algorithm for Task Scheduling on Heterogeneous Computing Systems Based on TRIZ
An efficient assignment and scheduling of tasks is one of the key elements in the effective utilization of heterogeneous multiprocessor systems. The task scheduling problem has been proven to be NP-hard, which is why we used meta-heuristic methods to find a suboptimal schedule. In this paper we propose a new approach using TRIZ (specifically, the 40 inventive principles). The basic idea of thi...
A new Shuffled Genetic-based Task Scheduling Algorithm in Heterogeneous Distributed Systems
Distributed systems such as Grid and Cloud Computing provision web services to users all over the world. One of the most important concerns that service providers face is managing total cost of ownership (TCO). A large part of TCO is related to power consumption due to inefficient resource management. The task scheduling module, as a key component, can have a drastic impact on both user...
Improving Concurrent Write Scheme in File Server Group
A comparative performance study of distributed mutual exclusion algorithms with a class of extended Petri nets; A practical comparison of cluster operating systems implementing sequential and transactional consistency; Clock synchronization state graphs based on clock precision differences; A recursive-adjustment co-allocation scheme in data grid environments; Reducing the b...
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid Computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...
Publication date: 1999